(54) Router monitoring in a data transmission system utilizing a network dispatcher for a cluster 
of hosts 



(57) Data transmission system for exchanging packetized
data between any IP host amongst a cluster of IP hosts
(10, 12, 14) having each at least an IP layer and a
network layer and a plurality of workstations (16, 18, 20)
by the intermediary of an IP network (22), wherein each IP
host is connected to the IP network via a layer 2 network
(24) such as a LAN interfacing the IP network by a set of
routers (26, 28) and by a network dispatcher (34) in charge
of receiving all incoming flows from the workstations and
dispatching them amongst the cluster of hosts. The system
comprises at least a monitoring device (36) included in the
cluster of hosts comprising means for monitoring the
availability of the candidate routers and means for
broadcasting the router availability information to each
host of the cluster of hosts via the network dispatcher
(34).
Description 

Technical field 

[0001] The present invention deals with a new way of
obtaining high availability and load balancing on default
routers for IP host systems, and relates in particular to
router monitoring in such a system utilizing a network
dispatcher for a cluster of hosts.

Background 

[0002] Several types of digital networks operating 
with the packet switching technique in which data from 
different origins are chopped into fixed or variable length 
packets or datagrams have been installed throughout 
the world, which need to be interconnected (e.g. via so 
called Routers) to optimize the possibilities of organizing 
traffic between source hosts and target hosts located 
anywhere in the world. This is made possible by using 
so-called internetworking. 

[0003] Internetwork (also referred to as internet)
facilities use a set of networking protocols such as
Transmission Control Protocol/Internet Protocol (TCP/IP)
developed to allow cooperating host computers to share
resources across the internetwork.
[0004] TCP/IP is a set of data communication protocols
referred to as the internet protocol (IP) suite. TCP and IP
are two of the protocols in this suite. Because they are
the best known, it has become common to use the term
TCP/IP to refer to the whole protocol family. Other
protocols of the suite are User Datagram Protocol (UDP),
Address Resolution Protocol (ARP), Real-time Transport
Protocol (RTP), etc.

[0005] An internet may thus be a collection of hetero- 
geneous and independent networks using TCP/IP, and 
connected together by routers. The administrative re- 
sponsibilities for an internet (e.g. to assign IP addresses 
and domain names) can be within a single network 
(LAN) or distributed among multiple networks. 
[0006] When a communication of data has to be established
from a source host to a particular IP destination over an
IP network, there are a number of methods to determine the
first hop router of the network towards this destination.
These include running (or snooping) a dynamic routing
protocol such as Routing Information Protocol (RIP) or Open
Shortest Path First (OSPF), running an ICMP router
discovery client, or using a statically configured default
route.

[0007] Running a dynamic routing protocol on every end-host
may be infeasible for a number of reasons, including
administrative overhead, processing overhead, security
issues, or lack of a protocol implementation for some
platforms. Neighbor or router discovery protocols may
require active participation by all hosts on a network,
leading to large timer values to reduce protocol overhead
in the face of large numbers of hosts. This can result in a
significant delay in the detection of a lost (i.e., dead)
neighbor, which may introduce unacceptably long "black
hole" periods.

[0008] The use of a statically configured default route is
quite popular: it minimizes configuration and processing
overhead on the end-host and is supported by virtually
every IP implementation. This mode of operation is likely
to persist as Dynamic Host Configuration Protocols (DHCP)
are deployed, which typically provide configuration for an
end-host IP address and default gateway. However, this
creates a single point of failure. Loss of the default
router results in a catastrophic event, isolating all
end-hosts that are unable to detect any alternate path that
may be available.
[0009] One solution to this problem is to allow hosts to
appear to use a single router and to maintain connectivity
even if the actual first hop router they are using fails.
Multiple routers participate in this protocol and in
concert create the illusion of a single virtual router. The
protocol ensures that one and only one of the routers is
forwarding packets on behalf of the virtual router. End
hosts forward their packets to the virtual router. The
router forwarding packets is known as the active router. A
standby router is selected to replace the active router
should it fail. The protocol provides a mechanism for
determining active and standby routers, using the IP
addresses on the participating routers. If an active router
fails, a standby router can take over without a major
interruption in the host's connectivity.
[0010] Another similar approach is the use of the Virtual
Router Redundancy Protocol (VRRP), designed to eliminate
the single point of failure inherent in the static default
routed environment. VRRP specifies an election protocol
that dynamically assigns responsibility for a virtual
router to one of the VRRP routers on a LAN. The VRRP router
controlling the IP address(es) associated with a virtual
router is called the Master, and forwards packets sent to
these IP addresses. The election process provides dynamic
fail-over in the forwarding responsibility should the
Master become unavailable. Any of the virtual router's IP
addresses on a LAN can then be used as the default first
hop router by end-hosts. The advantage gained from using
VRRP is a higher availability default path without
requiring configuration of dynamic routing or router
discovery protocols on every end-host.

[0011] Unfortunately, the two above solutions cannot
provide load balancing for a given host's traffic because
only the router that answered the ARP is used. Also,
customers are reluctant to change their main router
configuration to enable such a function.
[0012] This is why, in IBM docket FR 9 99 008 filed as a
European patent application, an IP source host is provided
with a new layer between the IP layer and the network layer
for dynamically selecting a router amongst a set of
candidate default routers, thereby ensuring both load
balancing and high availability.
[0013] Unfortunately, in case of a configuration with a
network dispatcher used as a front end to a cluster of
hosts, a host will always receive incoming packets from
the network dispatcher in response to ARP requests, as
opposed to packets from the candidate routers. Therefore,
it is not possible to maintain the status of active
candidate routers by resetting the age of an entry in the
ARP table each time a packet is received from a matching
network (MAC) address as in IBM docket FR 9 99 008. The
only solution is to issue periodic ARP requests to
candidate routers, with the drawback that all the hosts
have to monitor all the individual routers.

Summary of the invention 

[0014] Accordingly, the object of the invention is to 
provide a specific device for monitoring all the candidate 
routers in a data transmission system wherein a cluster 
of hosts is associated to a network dispatcher receiving 
all the incoming flows from an IP network. 
[0015] Another object of the invention is to achieve a 
method of determining the availability of candidate rout- 
ers in a data transmission system wherein a cluster of 
hosts is associated to a network dispatcher receiving all 
the incoming flows from an IP network. 
[0016] The invention relates therefore to a data
transmission system for exchanging packetized data between
any IP host amongst a cluster of IP hosts having each at
least an IP layer and a network layer and a plurality of
workstations by the intermediary of an IP network, wherein
each IP host is connected to the IP network via a layer 2
network interfacing the IP network by a set of routers and
by a network dispatcher in charge of receiving all incoming
flows from the workstations and dispatching them amongst
the cluster of hosts. Such a system comprises at least one
monitoring device included in the cluster of hosts
comprising means for monitoring the availability of the
routers and means for broadcasting the router availability
information to each host of the cluster of hosts via the
network dispatcher.

Brief description of the drawings 

[0017] The above and other objects, features and
advantages of the invention will be better understood by
reading the following more particular description of the
invention in conjunction with the accompanying drawings
wherein:

Fig. 1 represents schematically a data transmission
system wherein a cluster of hosts incorporates a
specific device for monitoring the availability of a set
of routers according to the invention,
Fig. 2 is a flow chart of the method implemented in
the invention for monitoring the availability of the
routers.

Detailed description of the invention 

[0018] In reference to Fig. 1, the invention is imple- 
mented in a data transmission system wherein a plural- 
ity of IP Hosts 10, 12, 14 transmit data to one or several 
workstations 16, 18, 20 via an IP network 22 by means 
of a layer 2 network such as a Local Area Network (LAN) 
24. LAN 24 interfaces IP network 22 through a set of input
routers such as routers 26, 28. The IP packets are routed
over the IP network via a plurality of routers (not shown)
until they reach output routers such as routers 30, 32
connected to workstations 16, 18, 20.
[0019] Instead of using a single default router to transmit
data over the IP network, a technique described in IBM
docket FR 9 99 008 has each host use a new layer between
the IP layer and the network layer, this additional layer
being in charge of selecting one amongst a set of candidate
routers such as routers 26 or 28 by running an algorithm
based upon parameters defined in the packet which is
transmitted.
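As an illustration only, the selection performed by such an additional layer can be sketched as a hash over packet parameters. The source does not specify the algorithm; the choice of CRC32, the parameters used (source address, destination address, destination port) and the names below are assumptions made for this sketch.

```python
import zlib


def select_router(candidates, src_ip, dst_ip, dst_port):
    """Pick one candidate router from parameters of the outgoing packet.

    Hypothetical helper: the hash function and the chosen packet
    parameters are assumptions, not taken from the described technique.
    """
    if not candidates:
        raise ValueError("no candidate router available")
    key = f"{src_ip}|{dst_ip}|{dst_port}".encode()
    # The same flow always maps to the same router; different flows
    # spread over the candidate set.
    return candidates[zlib.crc32(key) % len(candidates)]


routers = ["router26", "router28"]
choice = select_router(routers, "10.0.0.10", "192.0.2.7", 80)
```

Because the list passed in contains only the routers currently considered available, removing a failed router from the list is enough to steer new packets to the remaining candidates.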
[0020] In the present invention, it is assumed that the 
hosts 10, 12, 14 are grouped in a cluster associated with 
a network dispatcher interfacing LAN 24 with IP network 
22. Such a network dispatcher (ND) is a solution to the 
problems of keeping the load evenly spread or balanced 
on a group of hosts (or servers). It acts as a dispatcher 
of connections from users who know a single IP address 
for a service, to the set of hosts 10, 12 and 14 which 
actually perform the work. Only the packets going from 
the users such as workstations 16, 18 and 20 pass 
through network dispatcher 34. The packets from an IP 
host to a workstation may go by other routes which need 
not include the network dispatcher 34, thereby reducing 
the load on a network dispatcher and allowing it to po- 
tentially stand in front of a larger number of hosts. 
[0021] Since the cluster of hosts is seen by the users 
as the single address of the network dispatcher, it is 
therefore impossible to determine the availability of a 
router 26, 28 amongst the set of routers interfacing the 
IP network by only monitoring the data packets received 
from the IP network as mentioned above. The solution 
of the invention is therefore to add a router monitoring 
device (RM) 36 as a new member of the cluster of hosts. 
Instead of requiring each IP host to send ARP requests 
to each candidate router 26 or 28 in order to determine 
the availability of the latter, RM device 36 is in charge 
of sending periodically (the period can be short, e.g. 1 
to 10 seconds in order to ensure the best service) a uni- 
cast ARP request to all the candidate routers, and then 
to inform all the IP hosts about the availability status of 
each router using a broadcast ARP response. Thus, this 
function is performed with a minimum traffic and the 
number of IP hosts (or servers) using the set of candi- 
date routers can scale up without increased control traf- 
fic. 
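The scaling argument can be checked with a small count of control frames per monitoring period. This is a simplified model under stated assumptions: router replies are left out of both counts, and the RM device is assumed to send at most one broadcast per router per period. The function name is hypothetical.

```python
def control_frames_per_period(num_hosts, num_routers, with_monitor):
    """Count ARP control frames sent per period (router replies excluded)."""
    if with_monitor:
        # RM device: one unicast request per router, then at most one
        # broadcast response per router towards the hosts.
        return 2 * num_routers
    # Without the RM device, every host polls every router itself.
    return num_hosts * num_routers


per_host_polling = control_frames_per_period(50, 2, with_monitor=False)
with_rm_device = control_frames_per_period(50, 2, with_monitor=True)
```

With 50 hosts and 2 routers, per-host polling sends 100 requests per period, whereas the RM device sends 4 frames, and the latter figure does not change when hosts are added.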
[0022] It must be noted that the function of such router
monitoring device 36 may be integrated in one of the IP
hosts. Furthermore, there can be several router monitoring
devices or several IP hosts achieving this function.
[0023] Referring now to Fig. 2, the steps implemented in
the invention are as follows. First, an ARP request
(preferably a unicast request to all candidate routers) is
sent to a router (step 40) on a periodical basis by router
monitoring device 36. Then, it is checked (step 42) whether
an answer is received by RM device 36 from the router. If
so, this means that the router is available and an ARP
response packet is sent by the RM device to all IP hosts
(step 44). This response is preferably a MAC level
broadcast indicating the IP address and the MAC address of
the candidate router which has been requested as
information indicating the availability of the router. This
response forces all the IP hosts to update their
corresponding entry in the ARP table (step 46).
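A minimal sketch of the broadcast ARP response of step 44 is given below, built as a raw Ethernet frame with the standard ARP field layout. The addresses are placeholders, and filling the target fields with the broadcast MAC and the router's own IP address is an assumption of this sketch; the description only requires that the response carry the router's IP address and MAC address.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_response(rm_mac, router_ip, router_mac):
    """Build a MAC level broadcast ARP response announcing a router."""
    # Ethernet header: broadcast destination, RM device as source,
    # EtherType 0x0806 (ARP).
    eth_header = BROADCAST_MAC + mac_to_bytes(rm_mac) + struct.pack("!H", 0x0806)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # hardware and protocol address lengths
        2,        # operation: reply
        mac_to_bytes(router_mac), socket.inet_aton(router_ip),  # sender fields
        BROADCAST_MAC, socket.inet_aton(router_ip),  # target fields (assumed)
    )
    return eth_header + arp_body


frame = build_arp_response("02:00:00:00:00:36", "192.0.2.26", "02:00:00:00:00:26")
```

Actually transmitting the frame requires a raw layer 2 socket and the matching privileges, which are platform specific and left out of the sketch.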
[0024] When no answer is received from the candidate router
being requested (step 42), a test is made (step 48) to
determine whether a decision factor is reached. For
example, a router which fails to answer three times in a
row can be declared unavailable, but the decision factor
could be a different one. Assuming the decision is reached,
the RM device sends (step 50) an ARP response as a MAC
level broadcast to all the IP hosts. This response
indicates the IP address of the router and its MAC address
set to a default value such as all zeroes as information
indicating the unavailability of the router. This forces
all the IP hosts to update their ARP table (step 52) by
removing the ARP entry corresponding to the unavailable
router after recognizing the invalid MAC address. Note that
the entry can be updated with the invalid MAC address (e.g.
all zeroes) rather than removing the entry.
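The decision logic of Fig. 2 can be sketched as follows, assuming the example decision factor of three consecutive unanswered requests. The class and function names are hypothetical, and the probe result is passed in as a value so that the sketch stays independent of any network access.

```python
UNAVAILABLE_MAC = "00:00:00:00:00:00"  # default value signalling unavailability


class RouterMonitor:
    """Tracks consecutive unanswered ARP requests per router (steps 42, 48)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.misses = {}

    def record_probe(self, router_ip, answered_mac):
        """Return the MAC to broadcast after a probe, or None for no broadcast."""
        if answered_mac is not None:
            self.misses[router_ip] = 0
            return answered_mac  # step 44: announce availability
        self.misses[router_ip] = self.misses.get(router_ip, 0) + 1
        # Assumption: the announcement is repeated on every miss once the
        # decision factor is reached.
        if self.misses[router_ip] >= self.threshold:
            return UNAVAILABLE_MAC  # step 50: announce unavailability
        return None


def apply_broadcast(arp_table, router_ip, mac):
    """Host side update of the ARP table (steps 46 and 52)."""
    if mac == UNAVAILABLE_MAC:
        arp_table.pop(router_ip, None)  # remove the entry of the dead router
    else:
        arp_table[router_ip] = mac


monitor = RouterMonitor()
table = {"192.0.2.26": "02:00:00:00:00:26"}
for answer in (None, None, None):  # three unanswered requests in a row
    mac = monitor.record_probe("192.0.2.26", answer)
    if mac is not None:
        apply_broadcast(table, "192.0.2.26", mac)
```

After the third unanswered request the entry for the router is removed from the host table; a later answer resets the counter and restores the entry.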



Claims 

1. Data transmission system for exchanging packetized
data between any IP host amongst a cluster of IP hosts
(10, 12, 14) having each at least an IP layer and a
network layer and a plurality of workstations (16, 18, 20)
by the intermediary of an IP network (22), wherein each IP
host is connected to said IP network via a layer 2 network
(24) interfacing said IP network by a set of routers
(26, 28) and by a network dispatcher (34) in charge of
receiving all incoming flows from said workstations and
dispatching them amongst said cluster of hosts;
said system being characterized in that it comprises
at least a monitoring device (36) included in said
cluster of hosts comprising means for monitoring
the availability of said routers and means for
broadcasting the router availability information to each
host of said cluster of hosts via said network
dispatcher.
2. Data transmission system according to claim 1,
wherein said at least one monitoring device (36) is
incorporated in one of said cluster of IP hosts (10, 12,
14).

3. Data transmission system according to claim 1 or 
2, wherein said means for monitoring availability of 
said routers send periodically a unicast ARP re- 
quest to all the candidate routers (26, 28). 

4. Data transmission system according to claim 3, 
wherein said unicast ARP request to all the candi- 
date routers (26, 28) is sent on a periodic basis com- 
prised between 1 and 10 seconds. 

5. Data transmission system according to any one of 
claims 1 to 4, wherein said means for broadcasting 
the router availability send a MAC level broadcast 
indicating the IP address of the router being re- 
quested and an information on the availability of 
said router. 

6. Data transmission system according to claim 5, 
wherein said information on the availability of the 
router is the MAC address of said router when this 
one has answered and is available. 

7. Data transmission system according to claim 6, 
wherein said IP hosts (10, 12, 14) update their ARP 
table when receiving the MAC address of said rout- 
er being requested. 

8. Data transmission system according to claim 5, 
wherein said information on the availability of the 
router is a default value like all zeroes of the MAC 
address of said router when said router is consid- 
ered unavailable. 

9. Data transmission system according to claim 8, 
wherein said IP hosts (10, 12, 14) update their ARP 
table by removing the corresponding entry or writing 
said default value when said router is considered 
unavailable. 

10. Data transmission system according to claim 8 or 9, 
wherein said router being requested is considered 
unavailable when it has not answered three moni- 
toring requests in a row from said router monitoring 
device (36). 

11. Method of determining the availability of candidate
routers in a data transmission system for exchanging
packetized data between any IP host amongst a cluster of
IP hosts (10, 12, 14) having each at least an IP layer and
a network layer and a plurality of workstations
(16, 18, 20) by the intermediary of an IP network (22),
wherein each IP host is connected to said IP network via a
layer 2 network (24) interfacing said IP network by a set
of routers (26, 28) and by a network dispatcher (34) in
charge of receiving all incoming flows from said
workstations and dispatching them amongst said cluster of
hosts,
said method being characterized in that a unicast
ARP request is sent periodically to all candidate
routers and a MAC level broadcast is then transmitted
to all IP hosts for them to update their ARP table
with the router information on the availability.

12. Method according to claim 11, wherein said infor- 
mation on the availability of the router is the MAC 
address of said router when this one has answered 
and is available. 

13. Method according to claim 12, wherein said IP hosts
(10, 12, 14) update their ARP table when receiving
the MAC address of said router being requested.

14. Method according to claim 11, wherein said information
on the availability of the router is a default
value like all zeroes of the MAC address of said
router when said router is considered unavailable.

15. Method according to claim 14, wherein said IP hosts
(10, 12, 14) update their ARP table by removing the
corresponding entry or writing said default value
when said router is considered unavailable.

16. Method according to claim 14 or 15, wherein said
router being requested is considered unavailable
when it has not answered three monitoring requests
in a row from said router monitoring device
(36).